1 Basic notions and examples

Of course various kinds of metric spaces arise in various contexts and are 
viewed in various ways. In this brief survey we hope to give some modest 
indications of this. In particular, we shall try to describe some basic examples 
which can be of interest. 

For the record, by a metric space we mean a nonempty set $M$ together
with a distance function $d(x, y)$, which is a real-valued function on $M \times M$
such that $d(x, y) \ge 0$ for all $x, y \in M$, $d(x, y) = 0$ if and only if $x = y$,
$d(x, y) = d(y, x)$ for all $x, y \in M$, and

(1.1)    $d(x, z) \le d(x, y) + d(y, z)$

for all $x, y, z \in M$. This last property is called the triangle inequality, and
sometimes it is convenient to allow the weaker version

(1.2)    $d(x, z) \le C \, (d(x, y) + d(y, z))$

for a nonnegative real number $C$ and all $x, y, z \in M$, in which case
$(M, d(x, y))$ is called a quasi-metric space. Another variant is that we may
wish to allow $d(x, y) = 0$ to hold sometimes without having $x = y$, in which
case we have a semi-metric space, or a semi-quasi-metric space, as appropriate.
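
As a small illustration of these definitions, the following sketch checks the metric axioms by brute force on a finite set of points. The function name `is_metric`, the tolerance, and the sample points are assumptions made for the example; the optional constant `C` tests the weaker quasi-metric inequality (1.2).

```python
from itertools import product

def is_metric(points, d, C=1.0, tol=1e-12):
    """Brute-force check of the metric axioms on a finite set of points.

    With C > 1 the triangle inequality is relaxed to the quasi-metric
    version d(x, z) <= C (d(x, y) + d(y, z)).
    """
    for x, y in product(points, repeat=2):
        # Nonnegativity and symmetry.
        if d(x, y) < 0 or abs(d(x, y) - d(y, x)) > tol:
            return False
        # d(x, y) = 0 exactly when x = y.
        if (d(x, y) == 0) != (x == y):
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > C * (d(x, y) + d(y, z)) + tol:
            return False
    return True

points = [0.0, 0.5, 1.0, 2.5]
standard = lambda x, y: abs(x - y)      # the usual distance on the real line
squared = lambda x, y: (x - y) ** 2     # fails (1.1) but satisfies (1.2) with C = 2

print(is_metric(points, standard))      # metric
print(is_metric(points, squared))       # not a metric
print(is_metric(points, squared, C=2))  # quasi-metric with C = 2
```

The squared distance is a convenient example of a quasi-metric which is not a metric, since $(a + b)^2 \le 2 \, (a^2 + b^2)$ for all real numbers $a$, $b$.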

A sequence of points $\{x_j\}_{j=1}^\infty$ in a metric space $M$ with metric $d(x, y)$ is
said to converge to a point $x$ in $M$ if for every $\epsilon > 0$ there is a positive integer
$L$ such that

(1.3)    $d(x_j, x) < \epsilon$ for all $j \ge L$,

in which case we write

(1.4)    $\lim_{j \to \infty} x_j = x$.

A sequence $\{x_j\}_{j=1}^\infty$ of points in $M$ is said to be a Cauchy sequence if for
every $\epsilon > 0$ there is a positive integer $L$ such that

(1.5)    $d(x_j, x_k) < \epsilon$ for all $j, k \ge L$.

It is easy to see that every convergent sequence is a Cauchy sequence.
Conversely, a metric space in which every Cauchy sequence converges is said
to be complete.

A very basic example of a metric space is the real line $\mathbf{R}$ with its standard
metric. Recall that if $x$ is a real number, then the absolute value of $x$ is
denoted $|x|$ and defined to be equal to $x$ when $x \ge 0$ and to $-x$ when $x \le 0$.
One can check that

(1.6)    $|x + y| \le |x| + |y|$

and

(1.7)    $|x y| = |x| \, |y|$

when $x, y$ are real numbers, and that the standard distance function $|x - y|$
on $\mathbf{R}$ is indeed a metric.

Let $n$ be a positive integer, and let $\mathbf{R}^n$ denote the real vector space of
$n$-tuples of real numbers. Thus elements $x$ of $\mathbf{R}^n$ are of the form $(x_1, \ldots, x_n)$,
where the $n$ coordinates $x_j$, $1 \le j \le n$, are real numbers. If $x = (x_1, \ldots, x_n)$,
$y = (y_1, \ldots, y_n)$ are two elements of $\mathbf{R}^n$ and $r$ is a real number, then the sum
$x + y$ and scalar product $r x$ are defined coordinatewise in the usual manner,
by

(1.8)    $x + y = (x_1 + y_1, \ldots, x_n + y_n)$

and

(1.9)    $r x = (r x_1, \ldots, r x_n)$.

If $x$ is an element of $\mathbf{R}^n$, then the standard Euclidean norm of $x$ is denoted
$|x|$ and defined by

(1.10)    $|x| = \Big( \sum_{j=1}^n x_j^2 \Big)^{1/2}$.

One can show that

(1.11)    $|x + y| \le |x| + |y|$

holds for all $x, y \in \mathbf{R}^n$, and we shall come back to this in a moment.
Clearly we also have that

(1.12)    $|r x| = |r| \, |x|$

for all $r \in \mathbf{R}$ and $x \in \mathbf{R}^n$, which is to say that the norm of a scalar product
of a real number and an element of $\mathbf{R}^n$ is equal to the product of the absolute
value of the real number and the norm of the element of $\mathbf{R}^n$. Using these
properties, one can check that the standard Euclidean distance $|x - y|$ on $\mathbf{R}^n$
is indeed a metric.

More generally, a norm on $\mathbf{R}^n$ is a real-valued function $N(x)$ such that
$N(x) \ge 0$ for all $x \in \mathbf{R}^n$, $N(x) = 0$ if and only if $x = 0$,

(1.13)    $N(r x) = |r| \, N(x)$

for all $r \in \mathbf{R}$ and $x \in \mathbf{R}^n$, and

(1.14)    $N(x + y) \le N(x) + N(y)$

for all $x, y \in \mathbf{R}^n$. If $N(x)$ is a norm on $\mathbf{R}^n$, then

(1.15)    $d(x, y) = N(x - y)$

defines a metric on $\mathbf{R}^n$. As for metrics, one can weaken the triangle inequality
or relax the condition that $N(x) = 0$ implies $x = 0$ to get quasi-norms,
semi-norms, and semi-quasi-norms.

Recall that a subset $E$ of $\mathbf{R}^n$ is said to be convex if

(1.16)    $t x + (1 - t) y \in E$

whenever $x, y$ are elements of $E$ and $t$ is a real number such that $0 \le t \le 1$.
A real-valued function $f(x)$ on $\mathbf{R}^n$ is said to be convex if

(1.17)    $f(t x + (1 - t) y) \le t f(x) + (1 - t) f(y)$

for all $x, y \in \mathbf{R}^n$ and $t \in \mathbf{R}$ with $0 \le t \le 1$. If $N(x)$ is a real-valued
function on $\mathbf{R}^n$ which is assumed to satisfy the conditions of a norm except
for the triangle inequality, then one can check that the triangle inequality,
the convexity of the closed unit ball

(1.18)    $\{x \in \mathbf{R}^n : N(x) \le 1\}$,

and the convexity of $N(x)$ as a function on $\mathbf{R}^n$ are all equivalent.

For example, if $p$ is a real number such that $1 \le p < \infty$, then define $|x|_p$
for $x \in \mathbf{R}^n$ by

(1.19)    $|x|_p = \Big( \sum_{j=1}^n |x_j|^p \Big)^{1/p}$,

which is the same as the standard norm $|x|$ when $p = 2$. For $p = \infty$ let us
set

(1.20)    $|x|_\infty = \max\{ |x_j| : 1 \le j \le n \}$.

One can check that these define norms on $\mathbf{R}^n$, using the convexity of the
function $|r|^p$ on $\mathbf{R}$ when $1 \le p < \infty$ to check that the closed unit ball of $|x|_p$
is convex and hence that the triangle inequality holds when $1 \le p < \infty$.
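
The norms just described are easy to experiment with numerically. The sketch below is only an illustration: the names `p_norm` and `satisfies_triangle` and the sample vectors are assumptions, and the last line shows that the same formula with $p = 1/2$ fails the triangle inequality.

```python
def p_norm(x, p):
    """The norm |x|_p of a tuple of real numbers; p = float('inf') gives the max norm."""
    if p == float("inf"):
        return max(abs(c) for c in x)
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def satisfies_triangle(x, y, p, tol=1e-12):
    """Check |x + y|_p <= |x|_p + |y|_p for one pair of vectors."""
    total = [a + b for a, b in zip(x, y)]
    return p_norm(total, p) <= p_norm(x, p) + p_norm(y, p) + tol

x, y = (1.0, 0.0), (0.0, 1.0)
for p in (1, 1.5, 2, 3, float("inf")):
    print(p, satisfies_triangle(x, y, p))

# For p = 1/2 the "unit ball" is not convex and the inequality fails:
# |x + y|_{1/2} = (1 + 1)^2 = 4, while |x|_{1/2} + |y|_{1/2} = 2.
print(0.5, satisfies_triangle(x, y, 0.5))
```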

Let us now consider a class of metric spaces along the lines of Cantor
sets. For this we assume that we are given a sequence $\{F_j\}_{j=1}^\infty$ of nonempty
finite sets. We also assume that $\{\rho_j\}_{j=1}^\infty$ is a monotone decreasing sequence
of positive real numbers which converges to 0.

For our space $M$ we take the Cartesian product of the $F_j$'s, so that an
element $x$ of $M$ is a sequence $\{x_j\}_{j=1}^\infty$ such that $x_j \in F_j$ for all $j$. We define
a distance function $d(x, y)$ on $M$ by setting $d(x, y) = 0$ when $x = y$, and

(1.21)    $d(x, y) = \rho_j$

when $x_j \ne y_j$ and $x_i = y_i$ for all $i < j$. One can check that this does indeed
define a metric space, and in fact $d(x, y)$ is an ultra-metric, which is to say
that

(1.22)    $d(x, z) \le \max(d(x, y), d(y, z))$

for all $x, y, z \in M$.
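
The following sketch computes the distance (1.21) and checks the ultra-metric inequality (1.22) by brute force. It works with sequences truncated to finitely many terms, which is an assumption made so that the space is finite; the names `cantor_distance` and `is_ultrametric`, the choice $F_j = \{0, 1\}$, and the choice $\rho_j = 2^{-j}$ are likewise just for illustration.

```python
from itertools import product

def cantor_distance(x, y, rho):
    """d(x, y) = rho(j) for the first index j (counted from 1) with x_j != y_j, else 0."""
    for j, (a, b) in enumerate(zip(x, y), start=1):
        if a != b:
            return rho(j)
    return 0.0

def is_ultrametric(points, d):
    """Check d(x, z) <= max(d(x, y), d(y, z)) for all triples of points."""
    return all(d(x, z) <= max(d(x, y), d(y, z))
               for x, y, z in product(points, repeat=3))

rho = lambda j: 2.0 ** (-j)                # monotone decreasing, tends to 0
points = list(product([0, 1], repeat=4))   # sequences truncated to 4 terms
d = lambda x, y: cantor_distance(x, y, rho)

print(d((0, 1, 0, 0), (0, 1, 1, 0)))       # first difference at j = 3
print(is_ultrametric(points, d))
```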

The classical Cantor set is the subset of the unit interval $[0, 1]$ in the
real line obtained by first removing the open subinterval $(1/3, 2/3)$, then
removing the open middle thirds of the two closed intervals which remain,
and so on. Alternatively, the classical Cantor set can be described as the
set of real numbers $t$ such that $0 \le t \le 1$ and $t$ has an expansion base 3
whose coefficients are either 0 or 2. This set, equipped with the standard
Euclidean metric, is very similar to the general situation just described with
each $F_j$ having two elements and with $\rho_j = 2^{-j}$ for all $j$, although the metrics
are not quite the same.

In general, if (M,d(x,y)) is a metric space and E is a nonempty subset 
of M, then E can be considered as a metric space in its own right, using 
the restriction of the metric d(x, y) from M to E. Sometimes there may be 
another metric on E which is similar to the one inherited from the larger space 
M, and which has other nice properties, as in the case of Cantor sets just 
described. Another basic instance of this occurs with arcs in Euclidean spaces
which are "snowflakes", and which are similar to taking the unit interval $[0, 1]$
in the real line with the metric $|x - y|^a$ for some real number $a$, $0 < a < 1$,
or other functions of the standard distance on $[0, 1]$.

A nonempty subset $E$ of a metric space $(M, d(x, y))$ is said to be bounded
if the real numbers $d(x, y)$, $x, y \in E$, are bounded, in which case the diameter
of $E$ is denoted $\operatorname{diam} E$ and defined by

(1.23)    $\operatorname{diam} E = \sup\{ d(x, y) : x, y \in E \}$.

A stronger condition is that $E$ be totally bounded, which means that for each
$\epsilon > 0$ there is a finite collection $A_1, \ldots, A_k$ of subsets of $E$ such that

(1.24)    $E \subseteq \bigcup_{j=1}^k A_j$

and

(1.25)    $d(x, y) < \epsilon$ for all $x, y \in A_j$,

$j = 1, \ldots, k$. A basic feature of Euclidean spaces is that bounded subsets are
totally bounded, and the generalized Cantor sets described before are totally
bounded.

A metric space (M,d(x,y)) is compact if it is complete, so that every 
Cauchy sequence converges, and totally bounded. This is equivalent to the 
standard definitions in terms of open coverings or the existence of limit points. 
A closed and bounded subset of $\mathbf{R}^n$ is compact, and the generalized Cantor
sets described earlier are compact.

Another way that metric spaces arise is to start with a connected smooth
$n$-dimensional manifold $M$, which is basically a space which looks locally
like $n$-dimensional Euclidean space. At each point $p$ in $M$ one has an
$n$-dimensional tangent space $T_p(M)$, which looks like $\mathbf{R}^n$ as a vector space,
and on which one can put a norm. If at each point $p$ in $M$ one can identify
$T_p(M)$ with $\mathbf{R}^n$ with its standard norm, then the space is Riemannian, and
with general norms the space is of Riemann-Finsler type.

In this type of situation, the length of a nice path in M can be defined by 
integrating the infinitesimal lengths determined by the norms on the tangent 
spaces. The distance between two points is defined to be the infimum of the 
lengths of the paths connecting the two points. It is easy to see that this 
does indeed define a metric, with the triangle inequality being a consequence 
of the way that the distance is defined. 

A basic example of this is the $n$-dimensional sphere $S^n$ in $\mathbf{R}^{n+1}$, defined
by

(1.26)    $S^n = \{x \in \mathbf{R}^{n+1} : |x| = 1\}$.

The tangent space of $S^n$ at a point $p$ can be identified with the $n$-dimensional
linear subspace of $\mathbf{R}^{n+1}$ of vectors $x$ which are orthogonal to $p$, viewed as a
vector itself. This leads to a Euclidean norm on the tangent spaces, inherited
from the one on $\mathbf{R}^{n+1}$.

One can also define distances through paths in other situations. In a 
nonempty connected graph, every pair of points is connected by a path, the 
length of a path can be defined as the number of edges traversed, and the 
distance between two points can be defined as the length of the shortest 
path between the two points. In many standard fractals, like the Sierpinski 
gasket and carpet, there are a lot of paths of finite length between arbitrary 
elements of the fractal, and the infimum of the lengths of these paths defines 
a metric on the fractal. 

Let $(M, d(x, y))$ be a metric space. If $A$, $B$ are nonempty subsets of $M$
and $t$ is a positive real number, then let us say that $A$, $B$ are "$t$-close" if
for every $x \in A$ there is a $y \in B$ such that $d(x, y) < t$, and if for every
$y \in B$ there is an $x \in A$ such that $d(x, y) < t$. By definition, this relation is
symmetric in $A$ and $B$.

A subset $E$ of $M$ is said to be closed if for every sequence $\{z_j\}_{j=1}^\infty$ of
points in $E$ which converges to a point $z$ in $M$, we have that $z \in E$. Let us
write $S(M)$ for the set of nonempty closed and bounded subsets of $M$. If $A$,
$B$ are two elements of $S(M)$, then the Hausdorff distance $D(A, B)$ between $A$ and $B$
is defined to be the infimum of the set of positive real numbers $t$ such that
$A$, $B$ are $t$-close.

Because $A$ and $B$ are bounded subsets of $M$, there are positive real numbers
$t$ such that $A$, $B$ are $t$-close. The restriction to closed sets ensures that
if $A$, $B$ are $t$-close for all $t > 0$, then $A = B$. If $A_1$, $A_2$, $A_3$ are nonempty
subsets of $M$ and $t_1$, $t_2$ are positive real numbers such that $A_1$, $A_2$ are $t_1$-close
and $A_2$, $A_3$ are $t_2$-close, then one can check that $A_1$, $A_3$ are $(t_1 + t_2)$-close,
and this implies the triangle inequality for the Hausdorff distance.

From this it follows easily that $S(M)$, equipped with the Hausdorff distance
$D(A, B)$, is indeed a metric space. A basic result states that if $M$ is
compact, then $S(M)$ is compact too. This is not too difficult to show.
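
For finite sets the Hausdorff distance can be computed directly, since the infimum of the $t$ for which two finite sets are $t$-close is the larger of the two one-sided "farthest nearest-neighbor" distances. The sketch below assumes finite subsets of the real line with the standard metric; the function name `hausdorff` and the sample sets are illustrative assumptions.

```python
def hausdorff(A, B, d):
    """Hausdorff distance between two finite nonempty sets A and B.

    A and B are t-close exactly when each of the two one-sided quantities
    below is < t, so the infimum of such t is their maximum.
    """
    a_to_b = max(min(d(x, y) for y in B) for x in A)
    b_to_a = max(min(d(x, y) for x in A) for y in B)
    return max(a_to_b, b_to_a)

d = lambda x, y: abs(x - y)
A, B, C = [0, 1], [0, 3], [5]

print(hausdorff(A, B, d))
# Triangle inequality for the Hausdorff distance on this sample:
print(hausdorff(A, C, d) <= hausdorff(A, B, d) + hausdorff(B, C, d))
```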

Now let us consider a situation with a lot of deep mathematical structure 
which has been much-studied, involving interplay between algebra, analysis, 
and geometry. 

Fix a positive integer $n$, which we may as well take to be at least 2. Let
$\mathcal{M}_1^+$ denote the set of $n \times n$ real symmetric matrices which are
positive-definite and have determinant equal to 1. One can think of this as a smooth
hypersurface in the vector space of $n \times n$ real symmetric matrices, and it is
a smooth manifold in particular.

We can start with a Riemannian view of this space. For each $H \in \mathcal{M}_1^+$
we can identify the tangent space of $\mathcal{M}_1^+$ at $H$ with the vector space of $n \times n$
real symmetric matrices $A$ such that the trace of $H^{-1} A$ is equal to 0. Thus
for each such $A$, we get a one-parameter family of perturbations of $H$ by
taking $H + t A$, where $t$ is a real number with small absolute value, so that
$H + t A$ is still positive-definite, and to first order in $t$ these perturbations
also have determinant equal to 1.

Conversely, to first order in $t$, each smooth perturbation of $H$ in $\mathcal{M}_1^+$ is
of this form. Now, for each $A$ of this type, we define its norm as an element
of the tangent space to $\mathcal{M}_1^+$ at $H$ to be

(1.27)    $\big( \operatorname{tr} (H^{-1} A \, H^{-1} A) \big)^{1/2}$.

Here $\operatorname{tr} B$ denotes the trace of a square matrix $B$, and we are using ordinary
matrix multiplication in this expression.

When $H$ is the identity matrix, this reduces to the square root of $\operatorname{tr} A^2$,
which is a kind of Euclidean norm of $A$. For general $H$'s, we adapt the norm
to $H$. It is still basically a Euclidean norm, so that we are in the Riemannian
case.

Let us consider the transformation on $\mathcal{M}_1^+$ defined by $H \mapsto H^{-1}$. If
$H + t A$ is a basic first-order deformation of $H$, as before, then that is transformed
to $(H + t A)^{-1}$, which is the same as

(1.28)    $H^{-1} - t \, H^{-1} A \, H^{-1}$

to first order in $t$. Thus $A$ as a tangent vector at $H$ corresponds to $-H^{-1} A \, H^{-1}$
as a tangent vector at $H^{-1}$ under the mapping $H \mapsto H^{-1}$, and it is easy to
see that the norm of $A$ as a tangent vector at $H$ is equal to the norm of
$-H^{-1} A \, H^{-1}$ as a tangent vector at $H^{-1}$.

Now let $T$ be an $n \times n$ real matrix with determinant equal to 1, and let $T^*$
denote its transpose, which is also an invertible $n \times n$ matrix. Associated to
$T$ is the mapping

(1.29)    $H \mapsto T H T^*$,

and if $A$ corresponds to a tangent vector at $H$ as before, then $T A T^*$ is the
tangent vector at $T H T^*$ induced by our mapping. Again one can check that
the norm of $A$ as a tangent vector at $H$ is the same as the norm of $T A T^*$
as a tangent vector at $T H T^*$.
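
These invariance properties can be checked numerically in the case $n = 2$. The sketch below assumes that the norm of a tangent vector $A$ at $H$ is $(\operatorname{tr}(H^{-1} A H^{-1} A))^{1/2}$, which is consistent with the reduction to the square root of $\operatorname{tr} A^2$ at the identity matrix; the helper names and the particular matrices $H$, $A$, $T$ are illustrative choices, with $\det H = 1$ and $\operatorname{tr}(H^{-1} A) = 0$.

```python
def matmul(X, Y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inverse(X):
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def scale(c, X):
    return [[c * entry for entry in row] for row in X]

def tangent_norm(H, A):
    """(tr(H^{-1} A H^{-1} A))^{1/2}, the assumed norm of A at H."""
    B = matmul(inverse(H), A)
    BB = matmul(B, B)
    return (BB[0][0] + BB[1][1]) ** 0.5

H = [[2, 1], [1, 1]]   # symmetric, positive-definite, determinant 1
A = [[2, 1], [1, 0]]   # symmetric, with tr(H^{-1} A) = 0
T = [[1, 1], [0, 1]]   # integer entries, determinant 1

# Invariance under H -> T H T*, A -> T A T*.
H2 = matmul(matmul(T, H), transpose(T))
A2 = matmul(matmul(T, A), transpose(T))
# Invariance under H -> H^{-1}, A -> -H^{-1} A H^{-1}.
H3 = inverse(H)
A3 = scale(-1, matmul(matmul(H3, A), H3))

print(tangent_norm(H, A), tangent_norm(H2, A2), tangent_norm(H3, A3))
```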

Assume further that the entries of T are integers. This implies that the 
inverse of T also has integer entries, by Cramer's rule. The product of two 
such matrices has the same property, and indeed this defines a nice discrete 
group of matrices. 

This discrete group acts on $\mathcal{M}_1^+$, and we can pass to the corresponding
quotient space. That is, we now identify two elements $H_1$, $H_2$ in $\mathcal{M}_1^+$ if there
is an integer matrix $T$ as above so that

(1.30)    $H_2 = T H_1 T^*$.

This relation between $H_1, H_2 \in \mathcal{M}_1^+$ is indeed an equivalence relation, so
that we get a nice quotient.

Because of the discreteness of the group of integer matrices with determinant
1, the quotient space is still a smooth manifold, since it looks locally
like $\mathcal{M}_1^+$. The norm on the tangent spaces still makes sense as well, because
the transformations defining the equivalence relation preserve these norms,
as we have seen. Thus this quotient of $\mathcal{M}_1^+$ is still a nice smooth connected
Riemannian manifold.

In some situations like this the quotient space turns out to be compact.
In this case the quotient space is not compact, but it does have finite volume.
More precisely, the notion of volume at the level of the tangent spaces
is determined by the norm, and preserved in the present circumstances by
the transformations used in the equivalence relation, and the volume of the
quotient can be obtained by integrating the infinitesimal volumes.

The group of $n \times n$ matrices with integer entries and determinant 1 is a
very interesting special case of discrete groups more generally. Suppose now
that $\Gamma$ is a group and that $A$ is a finite symmetric subset of $\Gamma$ which generates
$\Gamma$, so that $a^{-1} \in A$ when $a \in A$ and every element of $\Gamma$ can be expressed
as a finite product of elements of $A$, with the identity element automatically
corresponding to an empty product. This leads to the associated Cayley
graph, in which two elements $\gamma_1$, $\gamma_2$ of $\Gamma$ are considered to be adjacent if $\gamma_2$
can be expressed as $\gamma_1 a$ for some $a \in A$, and to a distance function on $\Gamma$
which is invariant under left-multiplication in the group.



2 Aspects of analysis 

Let $(M, d(x, y))$ be a metric space, let $f(x)$ be a real-valued function on $M$,
and let $C$ be a nonnegative real number. We say that $f$ is $C$-Lipschitz if

(2.1)    $|f(x) - f(y)| \le C \, d(x, y)$

for all $x, y \in M$. This is equivalent to saying that

(2.2)    $f(x) \le f(y) + C \, d(x, y)$

for all $x, y \in M$. Notice that a function is 0-Lipschitz if and only if it is
constant.

For instance, if $p$ is a point in $M$, then $f_p(x) = d(x, p)$ is 1-Lipschitz,
because

(2.3)    $d(x, p) \le d(x, y) + d(y, p)$

by the triangle inequality. More generally, if $A$ is a nonempty subset of $M$,
and if $x$ is a point in $M$, then the distance from $x$ to $A$ is denoted $\operatorname{dist}(x, A)$
and defined by

(2.4)    $\operatorname{dist}(x, A) = \inf\{ d(x, y) : y \in A \}$,

and one can check that $\operatorname{dist}(x, A)$ is a 1-Lipschitz function on $M$. If $f_1$, $f_2$ are
two real-valued functions on $M$ which are $C_1$, $C_2$-Lipschitz, respectively, and
if $\alpha_1$, $\alpha_2$ are real numbers, then $\max(f_1, f_2)$, $\min(f_1, f_2)$ are $C$-Lipschitz with
$C = \max(C_1, C_2)$, and $\alpha_1 f_1 + \alpha_2 f_2$ is $C$-Lipschitz with $C = |\alpha_1| \, C_1 + |\alpha_2| \, C_2$.
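
As a sanity check of the claim about $\operatorname{dist}(x, A)$, one can estimate a Lipschitz constant on a finite sample of points. The sketch below uses points in the plane with the Euclidean metric; the names `dist_to_set` and `lipschitz_constant`, the sample grid, and the set `A` are assumptions made for the example, and a finite sample can of course only give a lower bound for the true constant.

```python
import math

def euclid(x, y):
    """Standard Euclidean distance between two points of the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def dist_to_set(x, A, d):
    """dist(x, A) for a finite nonempty set A."""
    return min(d(x, y) for y in A)

def lipschitz_constant(f, points, d):
    """Smallest C with |f(x) - f(y)| <= C d(x, y) for all sampled pairs."""
    return max(abs(f(x) - f(y)) / d(x, y)
               for x in points for y in points if x != y)

A = [(0.0, 0.0), (3.0, 1.0)]
sample = [(i * 0.5, j * 0.5) for i in range(-4, 9) for j in range(-4, 5)]
f = lambda x: dist_to_set(x, A, euclid)

print(lipschitz_constant(f, sample, euclid))   # should not exceed 1
```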




Now let $C$ be a nonnegative real number and let $s$ be a positive real
number. A real-valued function $f(x)$ on $M$ is said to be $C$-Lipschitz of order
$s$ if

(2.5)    $|f(x) - f(y)| \le C \, d(x, y)^s$

for all $x, y \in M$, which is again equivalent to

(2.6)    $f(x) \le f(y) + C \, d(x, y)^s$

for all $x, y \in M$. As before, $f(x)$ is 0-Lipschitz of order $s$ if and only if $f(x)$
is constant on $M$.

When $0 < s \le 1$, one can check that $d(x, y)^s$ is also a metric on $M$, which
defines the same topology on $M$ in fact. The main point in this regard is that
the triangle inequality continues to hold, which follows from the observation
that

(2.7)    $(\alpha + \beta)^s \le \alpha^s + \beta^s$

for all nonnegative real numbers $\alpha$, $\beta$. A real-valued function $f(x)$ on $M$
is $C$-Lipschitz of order $s$ with respect to the metric $d(x, y)$ if and only if
$f(x)$ is $C$-Lipschitz of order 1 with respect to $d(x, y)^s$, and as a result when
$0 < s \le 1$ one has the same statements for Lipschitz functions of order $s$ as
for ordinary Lipschitz functions.

When $s > 1$ the triangle inequality for $d(x, y)^s$ does not work in general,
but we do have that

(2.8)    $d(x, z)^s \le 2^{s-1} \, (d(x, y)^s + d(y, z)^s)$

for all $x, y, z \in M$, because

(2.9)    $(\alpha + \beta)^s \le 2^{s-1} \, (\alpha^s + \beta^s)$

for all nonnegative real numbers $\alpha$, $\beta$. Some of the usual properties of Lipschitz
functions carry over to Lipschitz functions of order $s$, perhaps with
appropriate modification, but it may be that the only Lipschitz functions of
order $s$ when $s > 1$ are constant. This is the case on Euclidean spaces with
their standard metrics, and indeed a Lipschitz function of order $s > 1$ has
first derivatives equal to 0 everywhere.
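
The two regimes for powers of a metric can be seen on a few points of the real line. In the sketch below, the name `triangle_defect` and the sample points are illustrative assumptions; the function returns the largest violation of the inequality $d(x, z)^s \le K \, (d(x, y)^s + d(y, z)^s)$ over all triples, so a value of 0 means the inequality holds on the sample.

```python
from itertools import product

def triangle_defect(points, d, s, K=1.0):
    """Largest value of d(x,z)^s - K (d(x,y)^s + d(y,z)^s) over all triples."""
    return max(d(x, z) ** s - K * (d(x, y) ** s + d(y, z) ** s)
               for x, y, z in product(points, repeat=3))

points = [0.0, 1.0, 2.0, 5.0]
d = lambda x, y: abs(x - y)

print(triangle_defect(points, d, 0.5))          # s <= 1: d^s is still a metric
print(triangle_defect(points, d, 2.0))          # s > 1: triangle inequality fails
print(triangle_defect(points, d, 2.0, K=2.0))   # but holds with K = 2^(s-1)
```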

In harmonic analysis one considers a variety of classes of functions with 
different kinds of restrictions on size, oscillations, regularity, and so on, and 
these Lipschitz classes are fundamental examples. In particular, it can be
quite useful to have the parameter $s$ available to adjust to the given circumstances.
There are also other ways of introducing parameters to get
interesting classes of functions and measurements of their behavior. 

If $M$ is the usual $n$-dimensional Euclidean space $\mathbf{R}^n$, with its standard
metric, then one has the extra structure of translations, rotations, and dilations.
If $f(x)$ is a real-valued function on $\mathbf{R}^n$ which is $C$-Lipschitz of order $s$, then
$f(x - u)$ is also $C$-Lipschitz of order $s$ for each $u \in \mathbf{R}^n$, $f(\Theta(x))$ is $C$-Lipschitz
of order $s$ for each rotation $\Theta$ on $\mathbf{R}^n$, and $f(r^{-1} x)$ is $(C \, r^{-s})$-Lipschitz of order
$s$ for each $r > 0$. In effect, on general metric spaces we can consider classes of
functions and measurements of their behavior which have analogous features,
even if there are not exactly translations, rotations, and dilations.

A basic notion is to consider various scales and locations somewhat independently.
In this regard, if $f(x)$ is a real-valued function on $M$, $x$ is an
element of $M$, and $t$ is a positive real number, put

(2.10)    $\operatorname{osc}(x, t) = \sup\{ |f(y) - f(x)| : y \in M, \ d(y, x) \le t \}$.

We implicitly assume here that $f(y)$ remains bounded on bounded subsets
of $M$, so that this quantity is finite. For instance, $f$ is $C$-Lipschitz of order
$s$ if and only if

(2.11)    $t^{-s} \operatorname{osc}(x, t) \le C$

for all $x \in M$ and $t > 0$.

Let us pause a moment and notice that

(2.12)    $\operatorname{osc}(w, r) \le 2 \operatorname{osc}(x, t)$

when $d(w, x) + r \le t$, because $|f(y) - f(w)| \le |f(y) - f(x)| + |f(x) - f(w)|$, and
both $y$ and $w$ lie within distance $t$ of $x$ when $d(y, w) \le r$. Thus,

(2.13)    $r^{-s} \operatorname{osc}(w, r) \le 2^{s+1} \, t^{-s} \operatorname{osc}(x, t)$

when $d(x, w) + r \le t$ and $r \ge t/2$. This is a kind of "robustness" property of
these measurements of local oscillation of a function $f$ on $M$. In particular, to
sample the behavior of $f$ at essentially all locations and scales, it is practically
enough to look at a reasonably nice and discrete family of locations and
scales. For example, one might restrict one's attention to radii $t$ which are
integer powers of 2, and for a specific choice of $t$ use a collection of points in
$M$ which suitably cover the various locations at that scale.
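
To make the idea of sampling at discrete locations and scales concrete, the sketch below computes the local oscillation on a finite grid in $[0, 1]$ and takes the supremum of $t^{-s}$ times the oscillation over grid points and dyadic radii. The names `osc` and `sampled_lipschitz_constant`, the grid, and the test function $f(x) = \sqrt{x}$ with $s = 1/2$ are assumptions chosen for illustration; this function satisfies $|f(x) - f(y)| \le |x - y|^{1/2}$, so the sampled quantity should not exceed 1.

```python
import math

def osc(f, points, d, x, t):
    """sup of |f(y) - f(x)| over sampled points y with d(y, x) <= t (x must be sampled)."""
    return max(abs(f(y) - f(x)) for y in points if d(y, x) <= t)

def sampled_lipschitz_constant(f, points, d, s, radii):
    """sup of t^(-s) osc(x, t) over the sampled locations x and scales t."""
    return max(t ** (-s) * osc(f, points, d, x, t)
               for x in points for t in radii)

points = [i / 64 for i in range(65)]       # grid of locations in [0, 1]
radii = [2.0 ** (-k) for k in range(7)]    # dyadic scales 1, 1/2, ..., 1/64
d = lambda x, y: abs(x - y)

print(sampled_lipschitz_constant(math.sqrt, points, d, 0.5, radii))
```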

Instead of simply taking a supremum of some measurements of local oscillation
like this, one can consider various sums of discrete samples of this
sort. This leads to a number of classes of functions and measurements of
their behavior. One can adjust this further by taking into account the relation
of some location and scale to some kind of boundaries, or singularities,
or concentrations, and so on. 

Of course one might also use some kind of measurement of sizes of subsets 
of M. This could entail diameters, volumes, or measurements of capacity. 
There are also many kinds of local measurements of oscillation or size that 
one can consider. As an extension of just taking suprema, one can take 
various local averages, or averages of powers of other quantities. Of course 
one can still bring in powers of the radius as before. 

Even if one starts with measurements of localized behavior which are not 
so robust in the manner described before, one can transform them into more 
robust versions by taking localized suprema or averages or whatever after- 
wards. Frequently the kind of overall aggregations employed have this kind 
of robustness included in effect, and one can make some sort of rearrange- 
ment to put this in starker relief. Let us also note that one often has local 
measurements which can be quite different on their own, but in some overall 
aggregation lead to equivalent classes of functions and similar measurements 
of their behavior. 

There are various moments, differences, and higher-order oscillations that 
can be interesting. As a basic version of this, one can consider oscillations of 
f(x) in terms of deviations from something like a polynomial of fixed positive 
degree, rather than simply oscillations from being constant, as with $\operatorname{osc}(x, t)$.
This can be measured in a number of ways. 

However, for these kinds of higher-order oscillations, additional structure 
of the metric space is relevant. On Euclidean spaces, or subsets of Euclidean 
spaces, one can use ordinary polynomials, for instance. This carries over 
to the much-studied setting of nilpotent Lie groups equipped with a family 
of dilations, where one has polynomials as in the Euclidean case, with the 
degrees of the polynomials defined in a different way using the dilations. 

These themes are closely related to having some kind of derivatives around. 
Just as there are various ways to measure the size of a function, one can 
get various measurements of oscillations looking at measurements of sizes of 
derivatives. It can also be interesting to have scales involved in a more active 
manner, and in any case there are numerous versions of ideas along these 
lines that one can consider.



3 Sub-Riemannian geometry

Let $n$ be a positive integer, and consider $(2n + 1)$-dimensional Euclidean
space, which we shall think of as

(3.1)    $\mathbf{R}^n \times \mathbf{R}^n \times \mathbf{R}$.

The case where $n = 1$, so that this is basically just $\mathbf{R}^3$, is already quite
interesting for us. We shall also be interested in

(3.2)    $\mathbf{R}^n \times \mathbf{R}^n$

and the obvious coordinate projection from the former to the latter given by

(3.3)    $(x, y, s) \mapsto (x, y)$.

Define a smooth 1-form $\alpha$ on (3.1) by

(3.4)    $\alpha = ds - \sum_{j=1}^n y_j \, dx_j$,

using the usual coordinates $(x, y, s)$ on (3.1). Thus if $p = (x, y, s)$ is a point
in (3.1), then $\alpha$ at $p$, which we denote $\alpha_p$, is a linear functional on the tangent
space to (3.1) at $p$. Of course we can identify the tangent space to (3.1) at $p$
with $\mathbf{R}^n \times \mathbf{R}^n \times \mathbf{R}$ itself in the usual way, and if $V = (v, w, u)$ is a tangent
vector to (3.1) at $p$ represented using this identification, then

(3.5)    $\alpha_p(V) = u - \sum_{j=1}^n y_j \, v_j$.

We can take the derivative $d\alpha$ of $\alpha$ in the sense of exterior differential
calculus to get a 2-form on (3.1). Namely,

(3.6)    $d\alpha = \sum_{j=1}^n dx_j \wedge dy_j$.

If one takes the wedge product of $\alpha$ with $n$ copies of $d\alpha$, then one gets

(3.7)    $(n!) \, ds \wedge dx_1 \wedge dy_1 \wedge \cdots \wedge dx_n \wedge dy_n$,

which is a nonzero $(2n+1)$-form on (3.1) that is $n$ factorial times the standard
volume form.
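
As a quick check of (3.7), here is the computation written out for $n = 2$, using the sign conventions of (3.4) and (3.6); this is a worked instance rather than a proof, and the general case follows the same pattern.

```latex
\begin{align*}
\alpha &= ds - y_1 \, dx_1 - y_2 \, dx_2, \qquad
d\alpha = dx_1 \wedge dy_1 + dx_2 \wedge dy_2, \\
d\alpha \wedge d\alpha
  &= dx_1 \wedge dy_1 \wedge dx_2 \wedge dy_2
   + dx_2 \wedge dy_2 \wedge dx_1 \wedge dy_1
   = 2 \, dx_1 \wedge dy_1 \wedge dx_2 \wedge dy_2, \\
\alpha \wedge d\alpha \wedge d\alpha
  &= 2 \, ds \wedge dx_1 \wedge dy_1 \wedge dx_2 \wedge dy_2 .
\end{align*}
```

The terms $y_j \, dx_j$ of $\alpha$ contribute nothing in the last line because $dx_j$ already occurs in $d\alpha \wedge d\alpha$, and the factor $2 = 2!$ counts the two orderings of the 2-forms $dx_1 \wedge dy_1$ and $dx_2 \wedge dy_2$, which commute with each other.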




If $p = (x, y, s)$ is a point in (3.1), then we get a special linear subspace $H_p$
of the tangent space to (3.1) at $p$ which is the kernel of $\alpha_p$. In other words,
$H_p$ consists of the tangent vectors $V = (v, w, u)$ such that

(3.8)    $u - \sum_{j=1}^n y_j \, v_j = 0$.

Thus $H_p$ has dimension $2n$ at every point $p$.

If $f$ is a smooth real-valued function on (3.1) which is never equal to
0, then $f \alpha$ is also a 1-form on (3.1) which vanishes exactly at the tangent
vectors in $H_p$ at a point $p$. By the usual rules of exterior differential calculus,

(3.9)    $d(f \alpha) = df \wedge \alpha + f \, d\alpha$.

One can check that the wedge product of $f \alpha$ with $n$ copies of $d(f \alpha)$ is the
same as $f^{n+1}$ times the wedge product of $\alpha$ with $n$ copies of $d\alpha$.

A nice feature of $d\alpha$ is that it can be viewed as the pull-back of a 2-form
from (3.2). Basically this simply means that $d\alpha$ does not contain any
$ds$'s and the coefficients depend only on $x$, $y$. However, $\alpha$ contains $ds$ as an
important term, and $\alpha$ is not the pull-back of a form from (3.2).

For each point $p = (x, y, s)$ in (3.1), let $F_p$ denote the 1-dimensional linear
subspace of the tangent space to (3.1) at $p$ consisting of vectors of the form
$(0, 0, t)$, $t \in \mathbf{R}$. This is the same as the subspace of tangent vectors to the
fiber of the mapping from (3.1) to (3.2) through $p$, which is also the same
as the set of tangent vectors in the kernel of the differential of the aforementioned mapping.
Notice that $F_p$ and $H_p$ are complementary subspaces of the tangent space to
(3.1) at $p$, which is to say that every vector $V = (v, w, u)$ in the tangent space at
$p$ can be written in a unique way as a sum of vectors in $F_p$ and $H_p$, namely

(3.10)    $V = \Big( 0, 0, u - \sum_{j=1}^n y_j \, v_j \Big) + \Big( v, w, \sum_{j=1}^n y_j \, v_j \Big)$.

Let $I$ be an interval in the real line with positive length, and which may
be unbounded. Suppose that $\gamma(t)$ is a continuous function defined for $t$ in $I$
and with values in (3.1), so that $\gamma(t)$ defines a continuous path in (3.1). Let
us assume that $\gamma(t)$ is continuously-differentiable on $I$, so that the derivative

(3.11)    $\dot{\gamma}(t) = \frac{d}{dt} \gamma(t)$

exists for each $t \in I$ and is continuous on $I$. More precisely, if $t$ is in the
interior of $I$, then the derivative is taken in the usual sense, while if $t$ is
an element of $I$ which is also an endpoint of $I$, then one uses a one-sided
derivative.

We say that this path $\gamma(t)$ is horizontal if

(3.12)    $\dot{\gamma}(t) \in H_{\gamma(t)}$

for each $t \in I$. This is equivalent to asking that

(3.13)    $\alpha(\dot{\gamma}) = 0$

along the curve, which is to say that

(3.14)    $\alpha_{\gamma(t)}(\dot{\gamma}(t)) = 0$

for all $t \in I$. If we write $\gamma(t)$ more explicitly as $(x(t), y(t), s(t))$, then this
becomes

(3.15)    $\dot{s}(t) = \sum_{j=1}^n y_j(t) \, \dot{x}_j(t)$

for all $t \in I$.

Let $\gamma_0(t)$ be the projection of $\gamma(t)$ into (3.2), which means that $\gamma_0(t) =
(x(t), y(t))$. If $\gamma(t)$ is a horizontal curve in (3.1), then $\gamma(t)$ is uniquely determined
by its projection $\gamma_0(t)$ and the value of $\gamma(t)$ at a single point $t = t_0 \in I$.
Indeed, if $\gamma(t) = (x(t), y(t), s(t))$ is horizontal, then $\dot{s}(t)$ can be expressed in
terms of $x(t)$ and $y(t)$ as before, and $s(t)$ is determined by this and the value
of $s(t)$ at one point $t_0$.

Conversely, suppose that we are given a continuously-differentiable curve
$\gamma_0(t)$ in (3.2) defined for $t \in I$. Also let $t_0$ be an element of $I$, and suppose
that $s_0$ is some real number. Then there is a continuously-differentiable curve
$\gamma(t) = (x(t), y(t), s(t))$, $t \in I$, in (3.1) whose projection into (3.2) is equal to
$\gamma_0(t)$ and such that $s(t_0) = s_0$. Namely, one can compute what $\dot{s}(t)$ should
be in terms of $x(t)$ and $y(t)$, and integrate that using also $s(t_0) = s_0$ to get
$s(t)$ for all $t \in I$.
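
The lifting procedure can be carried out numerically. The sketch below assumes the horizontality condition $\dot{s}(t) = \sum_j y_j(t) \, \dot{x}_j(t)$ and approximates the integral of $\sum_j y_j \, dx_j$ along the projected curve with a midpoint-type sum; the function name `horizontal_lift`, the step count, and the example curves are illustrative assumptions. For the unit circle in the case $n = 1$, the lifted curve does not close up: $s$ changes by $-\pi$ after one counterclockwise circuit.

```python
import math

def horizontal_lift(x, y, t0, t1, s0, steps=10000):
    """Approximate s(t1) for the horizontal lift of t -> (x(t), y(t)) with s(t0) = s0.

    x and y map a parameter t to tuples of n real numbers. Each step adds
    sum_j y_j(midpoint) * (x_j(b) - x_j(a)), approximating the integral of y dx.
    """
    s = s0
    h = (t1 - t0) / steps
    for k in range(steps):
        a, b = t0 + k * h, t0 + (k + 1) * h
        xa, xb = x(a), x(b)
        ym = y((a + b) / 2)
        s += sum(yj * (xbj - xaj) for yj, xaj, xbj in zip(ym, xa, xb))
    return s

# Unit circle in the (x, y)-plane, traversed once counterclockwise.
circle_x = lambda t: (math.cos(t),)
circle_y = lambda t: (math.sin(t),)
print(horizontal_lift(circle_x, circle_y, 0.0, 2 * math.pi, 0.0))   # close to -pi

# A segment along the x-axis (y = 0) lifts to a curve with constant s.
print(horizontal_lift(lambda t: (t,), lambda t: (0.0,), 0.0, 1.0, 3.0))
```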

It is not too difficult to show that each pair of points $p$, $q$ in (3.1) can
be connected by a continuously-differentiable curve which is horizontal. One
can start by taking the projections of these two points to get two points in
(3.2) which can be connected by all sorts of curves. These curves can be lifted
to horizontal curves in (3.1) which begin at $p$, as in the preceding paragraph.
In general a lifted path like this may not go to $q$, but one can check that
there are plenty of choices of paths in (3.2) for which the lifting does
end at $q$.

This leads to a very interesting kind of geometry in (3.1). Namely, one
defines the distance between two points $p$, $q$ to be the infimum of the lengths
of the horizontal paths joining $p$ to $q$. Of course the standard Euclidean
metric on (3.1) can be defined by minimizing the lengths of all paths from
$p$ to $q$. The restriction to horizontal paths makes the distance increase,
although one can check that the resulting metric is still compatible with the
usual topology on (3.1).

4 Hyperbolic groups 

Let $\Gamma$ be a group, and let $F$ be a finite set of elements of $\Gamma$. By a word over
$F$ we mean a formal product of elements of $F$ and their inverses. Every word
over $F$ determines an element of the group $\Gamma$, simply using the group operations.
The "empty word" is considered a word over $F$, which corresponds to
the identity element of $\Gamma$.

If $z$ is a word over $F$, then the length of $z$ is denoted $L(z)$ and is the
number of elements of $F$ such that they or their inverses are used in $z$,
counting multiplicities. A word $z$ is said to be irreducible if it does not
contain an $a \in F$ next to $a^{-1}$, i.e., so that all obvious cancellations have
been made. If a word $z$ over $F$ corresponds to the identity element of $\Gamma$, then
$z$ is said to be trivial.

A finite subset $F$ of a group $\Gamma$ is a set of generators of $\Gamma$ if every element
of $\Gamma$ corresponds to a word over $F$. A group is said to be finitely-generated if
it has a finite set of generators. Let us make the convention that a generating
set $F$ of a group $\Gamma$ should not contain the identity element of $\Gamma$.

Suppose that $\Gamma$ is a group and that $F$ is a finite set of generators of $\Gamma$. The
Cayley graph associated to $\Gamma$ and $F$ is the graph consisting of the elements
of $\Gamma$ as vertices with the provision that $\gamma_1$, $\gamma_2$ in $\Gamma$ are adjacent if $\gamma_2 = \gamma_1 a$,
where $a$ is an element of $F$ or its inverse. Thus this relation is symmetric in
$\gamma_1$ and $\gamma_2$.

A finite sequence $\theta_0, \theta_1, \ldots, \theta_k$ of elements of $\Gamma$ is said to define a path if
$\theta_j$, $\theta_{j+1}$ are adjacent in the Cayley graph for each $j$, $0 \le j \le k - 1$. The
length of this path is defined to be $k$. We include the degenerate case where
$k = 0$, so that a single element of $\Gamma$ is viewed as a path of length 0.

If $\phi$, $\psi$ are elements of $\Gamma$, then the distance between $\phi$ and $\psi$ is defined
to be the shortest length of a path that connects $\phi$ to $\psi$. In particular, note
that for any two elements $\phi$, $\psi$ in $\Gamma$ there is a path which starts at $\phi$ and
ends at $\psi$. To see this, one can write $\psi$ as $\phi \beta$ for some $\beta$ in $\Gamma$, and then use
the assumption that $\Gamma$ is generated by $F$ to obtain a path from $\phi$ to $\psi$ one
step at a time.

These definitions are invariant under left translations in Γ. In other
words, if δ is any fixed element of Γ, then the transformation γ ↦ δγ on Γ
defines an automorphism of the Cayley graph, and it also preserves distances
between elements of Γ. This follows from the definitions, since the Cayley
graph was defined in terms of right-multiplication by generators and their
inverses.
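The distance on a Cayley graph and its invariance under left translations can be checked by brute force in a small example. The sketch below is an illustration chosen here, not taken from the text: it uses the symmetric group on three letters, with permutations stored as tuples and two transpositions `s`, `t` as the generating set, and it computes distances by breadth-first search. The search is guaranteed to terminate only because the group is finite.

```python
from collections import deque
from itertools import permutations


def compose(p, q):
    """Product p*q of permutations stored as tuples: (p*q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def word_distance(phi, psi, generators):
    """Shortest length of a path from phi to psi in the Cayley graph."""
    steps = set(generators) | {inverse(a) for a in generators}
    dist = {phi: 0}
    queue = deque([phi])
    while queue:
        gamma = queue.popleft()
        if gamma == psi:
            return dist[gamma]
        for a in steps:
            nxt = compose(gamma, a)   # right multiplication by a generator
            if nxt not in dist:
                dist[nxt] = dist[gamma] + 1
                queue.append(nxt)
    return None  # reached only if `generators` does not generate the group


S3 = list(permutations(range(3)))     # all six elements of the group
identity = (0, 1, 2)
s, t = (1, 0, 2), (0, 2, 1)           # two transpositions generating S3
F = [s, t]

# Left translation by delta should not change any distance.
left_invariant = all(
    word_distance(compose(delta, x), compose(delta, y), F)
    == word_distance(x, y, F)
    for delta in S3 for x in S3 for y in S3
)
```

In this example the element `(2, 1, 0)` is at distance 3 from the identity, and `left_invariant` evaluates to `True`.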

A basic fact is that this definition of distance does not depend too strongly 
on the choice of generating set F, in the sense that if one has another finite 
generating set, then the two distance functions associated to these generating 
sets are each bounded by a constant multiple of the other. This is not difficult 
to check, by expressing each generator in one set as a finite word over the 
other set of generators. There are only a finite number of these expressions, 
so that their maximal length is a finite number. 
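The comparison between two generating sets can be written out as follows. The notation d_F, d_{F'} for the two distance functions and K for the maximal length of the chosen expressions is introduced here only for this sketch.

```latex
% Assumed notation: F, F' are two finite generating sets, d_F and d_{F'} the
% associated distances, and K the largest length of the words over F' chosen
% to represent the elements of F.
% Take a shortest path with respect to F from \varphi to \psi:
\begin{align*}
  \varphi = \theta_0, \theta_1, \dots, \theta_k = \psi, \qquad
  \theta_{j+1} = \theta_j a_j^{\pm 1}, \quad a_j \in F, \quad
  k = d_F(\varphi, \psi).
\end{align*}
% Each step a_j^{\pm 1} is a word over F' of length at most K, hence
\begin{align*}
  d_{F'}(\theta_j, \theta_{j+1}) \le K
  \quad \Longrightarrow \quad
  d_{F'}(\varphi, \psi)
  \le \sum_{j=0}^{k-1} d_{F'}(\theta_j, \theta_{j+1})
  \le K \, d_F(\varphi, \psi).
\end{align*}
```

Exchanging the roles of F and F' gives the inequality in the other direction, with a possibly different constant.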

Let us continue with the assumption that we have a fixed generating set
F for the group Γ. Suppose that R is a finite set of words over F. We say
that R is a set of relations for Γ if every element of R is a trivial word. The
inverses of elements of R are also then trivial words, as well as conjugates of
elements of R. That is, if r is an element of R and u is any word over F, then
u r u^{-1} is the conjugate of r by u, and it is a trivial word since r is.
Products of conjugates of elements of R and their inverses are trivial words
too, as well as words obtained from these through cancellations, i.e., by
cancelling a a^{-1} and a^{-1} a whenever a is an element of F. The combination
of F and a set R of relations defines a presentation of Γ if every word over F
which corresponds to the identity element of Γ can be obtained in this manner.
The empty word is viewed as being equal to the empty product of relations, so
that it is automatically included. A group Γ is said to be finitely-presented
if there is a presentation with a finite set of generators and a finite set of
relations. For instance, if Γ is the free group with generators in F, then one
can take R to be the set consisting of the empty word, and this defines a
presentation for Γ.

Let us call a word over F trivial if it corresponds to the identity element
of Γ. Suppose that w is a trivial word, with

(4.1) w = β_1 β_2 ⋯ β_n,

where each β_i is an element of F or an inverse of an element of F. This leads
to a path θ_0, θ_1, ..., θ_n, where θ_0 is the identity element of Γ and θ_j is
equal to β_1 β_2 ⋯ β_j when j ≥ 1. Because w is a trivial word, θ_n is also the
identity element in Γ, which is to say that this path is a loop that begins and
ends at the identity element.
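The passage from a trivial word to a loop can be seen in a simple case. The following sketch is an illustration chosen here rather than an example from the text: the group is Z^2, written additively, with two generators called `a` and `b`, and the trivial word is the commutator a b a^{-1} b^{-1}. The partial products θ_j trace out the boundary of a unit square.

```python
# Generators of Z^2 written additively; the names "a" and "b" and the
# (generator, sign) encoding of words are illustrative assumptions.
GENERATORS = {"a": (1, 0), "b": (0, 1)}


def path_of_word(word):
    """Return theta_0, ..., theta_n for a word beta_1 beta_2 ... beta_n."""
    point = (0, 0)                    # theta_0 is the identity element
    path = [point]
    for gen, sign in word:
        dx, dy = GENERATORS[gen]
        point = (point[0] + sign * dx, point[1] + sign * dy)
        path.append(point)            # theta_j = beta_1 beta_2 ... beta_j
    return path


commutator = [("a", 1), ("b", 1), ("a", -1), ("b", -1)]
loop = path_of_word(commutator)
```

The resulting path is (0, 0), (1, 0), (1, 1), (0, 1), (0, 0), which begins and ends at the identity element.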

Fix a finite set R of relations, so that F and R give a presentation for Γ.
Let w be a trivial word over F which is also irreducible. Define A(w) to be the
smallest nonnegative integer A for which there exist relations r_1, r_2, ..., r_k
in R, integers b_1, b_2, ..., b_k, and words u_1, u_2, ..., u_k over F such that
the expression

(4.2) u_1 r_1^{b_1} u_1^{-1} u_2 r_2^{b_2} u_2^{-1} ⋯ u_k r_k^{b_k} u_k^{-1}

can be reduced to w after cancellations as before,

(4.3) ∑_{j=1}^{k} L(u_j) ≤ A,

and

(4.4) ∑_{j=1}^{k} |b_j| L(r_j)^2 ≤ A.

Here if z is a word over F and b is an integer, then z^b is defined in the
obvious manner, by simply repeating z b times when b ≥ 0, or repeating z^{-1}
−b times when b < 0. A representation of this type for w necessarily exists,
since F and R give a presentation for Γ.

The group Γ is said to be hyperbolic if there is a nonnegative real number
C so that

(4.5) A(w) ≤ C L(w)

for all irreducible trivial words w. The property of hyperbolicity does not
depend on the choice of finite presentation for Γ, and in fact there are other
definitions for which one only needs to assume that Γ is finitely generated, and
the existence of a finite presentation is then a consequence. This
characterization of hyperbolicity is discussed in Section 2.3 of [Gro1]. Some
examples of hyperbolic groups are finitely-generated free groups and the
fundamental groups of compact connected Riemannian manifolds without boundary
and strictly negative curvature. In particular, this includes the fundamental
group of a closed Riemann surface with genus at least 2.

Let M be a nonempty set. A nonnegative real-valued function d(x, y)
on the Cartesian product M × M is said to be a quasimetric if d(x, y) = 0
exactly when x = y, d(x, y) = d(y, x) for all x, y ∈ M, and

(4.6) d(x, z) ≤ C (d(x, y) + d(y, z))

for some positive real number C and all x, y, z ∈ M. If this last condition
holds with C = 1, then d(x, y) is said to be a metric on M.
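A simple quasimetric which is not a metric is |x − y|^2 on the real line. This example is supplied here for illustration and is not taken from the text. The elementary inequality (s + t)^2 ≤ 2 (s^2 + t^2) gives the quasi-triangle inequality with C = 2, while the points 0, 1, 2 show that C = 1 does not work. The sketch below checks both claims on a small grid of integer sample points, which is of course only a sanity check and not a proof.

```python
from itertools import product


def d(x, y, a=2):
    """Candidate quasimetric |x - y|**a on the real line."""
    return abs(x - y) ** a


def quasi_triangle_holds(points, C, a=2):
    """Check d(x, z) <= C * (d(x, y) + d(y, z)) over all sample triples."""
    return all(
        d(x, z, a) <= C * (d(x, y, a) + d(y, z, a))
        for x, y, z in product(points, repeat=3)
    )


points = range(-5, 6)    # integer samples keep the arithmetic exact

holds_with_C2 = quasi_triangle_holds(points, C=2)   # quasimetric condition
holds_with_C1 = quasi_triangle_holds(points, C=1)   # ordinary triangle inequality
```

On these samples `holds_with_C2` is `True` and `holds_with_C1` is `False`; with `a=1` the ordinary triangle inequality holds again.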

If d(x, y) is a quasimetric on M and a is a positive real number, then
d(x, y)^a is also a quasimetric on M. If d(x, y) is a metric on M and a is a
positive real number such that a ≤ 1, then d(x, y)^a is a metric on M too.
These statements are not difficult to verify. There is a very nice result going
in the other direction, which states that if d(x, y) is a quasimetric on M, then
there are positive real numbers C', δ and a metric ρ(x, y) on M such that

(4.7) C'^{-1} ρ(x, y)^δ ≤ d(x, y) ≤ C' ρ(x, y)^δ

for all x, y ∈ M. See [MacS1].

If d(x, y) is a quasimetric on M, then one has many of the same basic
notions as for a metric, such as convergence of sequences, open and closed
sets, dense subsets, and so on. For instance, it makes sense to say that M
is separable with respect to a quasimetric if it has a subset which is at most
countable and also dense, and one can define the topological dimension for M
as in [HurW]. The diameter of a subset can be defined in the usual manner
using the quasimetric, and this permits one to define the Hausdorff dimension
of a nonempty subset of M. A famous result about metric spaces is that the
topological dimension is always less than or equal to the Hausdorff dimension.
See Chapter VII of [HurW]. This does not work for quasimetrics in general,
and it cannot possibly work. For if (M, d(x, y)) is a quasimetric space with
Hausdorff dimension s and a is a positive real number, then (M, d(x, y)^a) has
Hausdorff dimension s/a, while the topological dimension of (M, d(x, y)^a) is
the same as that of (M, d(x, y)).

Let Γ be a finitely-presented group which is hyperbolic. Associated to Γ
is a space Σ which is a kind of "space at infinity" or ideal boundary of Γ,
consisting of equivalence classes of asymptotic directions in Γ. This space is
a compact Hausdorff topological space of finite dimension, as on p110-1 of
[Gro1], and it contains a copy of the Cantor set as soon as it has at least
three elements. If Σ has at most two elements, then Γ is said to be elementary.
For a free group with at least two generators the space at infinity is
homeomorphic to a Cantor set, while Z, a free group with one generator, is
elementary and has two points in the space at infinity. If Γ is the fundamental
group of a closed Riemann surface of genus at least 2, then Σ is homeomorphic
to the unit circle in R^2. More generally, if Γ is the fundamental group of a
compact n-dimensional Riemannian manifold without boundary with strictly
negative curvature, then Σ is homeomorphic to the unit sphere S^{n-1} in R^n.

Actually, the space at infinity is defined for any hyperbolic metric space 
in [Gro1], and this can be specialized to a hyperbolic group. It is often
preferable to work with metric spaces which are "geodesic", in the sense that
any pair of points can be connected by a curve whose length is equal to the 
distance between the two points. It is often useful to think of a hyperbolic 
group as acting on a geodesic hyperbolic metric space by isometries, and to 
use that to study the space at infinity. 

It does not customarily seem to be said this way, but I think it is fair to say 
that what are basically defined on the space at infinity are quasimetrics, at 
least initially. More precisely, it is more like the logarithm of a quasimetric, 
or, in other words, there is a one-parameter family of quasimetrics which 
are powers of each other. A few years ago Gromov casually asked about 
approximating quasimetrics by metrics in the manner described before, and
this is presumably the reason. In Section 7.2 of [Gro1] one takes a different
route, in effect compactifying a geodesic hyperbolic metric space by looking at 
modified measurements of lengths of curves which take densities into account, 
densities that decay at infinity in a suitable manner. 

In nice situations, such as hyperbolic groups, and universal coverings 
of compact Riemannian manifolds without boundary and strictly negative 
curvature in particular, there are doubling conditions on the space at infinity. 
Compare with [Pan1]. There are also interesting measures around, as in
[DoS].

A well-known result of Borel [Bor1, Rag] says that simply-connected
symmetric spaces can be realized as the universal covering of a compact
manifold. If the symmetric space is of noncompact type and rank 1, it has
negative curvature, and thus the fundamental group of the compact quotient,
which is a uniform lattice in the group of isometries of the symmetric space,
is a hyperbolic group. If the symmetric space is a classical hyperbolic space
of dimension n, with constant negative curvature, then the space at infinity
can be identified with a Euclidean sphere of dimension n − 1. If the symmetric
space is a complex hyperbolic space of complex dimension m, then the space
at infinity can be identified topologically with a Euclidean sphere of real
dimension 2m − 1, but the geometry corresponds to a sub-Riemannian space
when m ≥ 2. For other symmetric spaces of noncompact type and rank
1, one again obtains topological spheres of dimension 1 less than the real
dimension of the symmetric space, with more complicated sub-Riemannian
structures.

5 p-Adic numbers

Let Z denote the integers, Q denote the rational numbers, and let | · | denote
the usual absolute value function or modulus on the complex numbers C. On
the rational numbers there are other absolute value functions that one can
consider. Namely, if p is a prime number, define the p-adic absolute value
function | · |_p on Q by |x|_p = 0 when x = 0, |x|_p = p^{-k} when x = p^k m/n,
where k is an integer and m, n are nonzero integers which are not divisible
by p. One can check that

(5.1) |x y|_p = |x|_p |y|_p

and

(5.2) |x + y|_p ≤ |x|_p + |y|_p

for all x, y ∈ Q, and in fact

(5.3) |x + y|_p ≤ max(|x|_p, |y|_p)

for all x, y ∈ Q.
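The definition of the p-adic absolute value is easy to compute with exactly. The following is a minimal sketch using Python's `fractions.Fraction`; the function names `padic_valuation` and `padic_abs` and the list `samples` are choices made here for illustration. It evaluates |x|_p from the factorization x = p^k m/n and then checks (5.1) and (5.3) on a handful of rational numbers, which is a sanity check rather than a proof.

```python
from fractions import Fraction


def padic_valuation(x, p):
    """Exponent k with x = p**k * m/n, where p divides neither m nor n."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the valuation of 0 is not defined")
    k, num, den = 0, x.numerator, x.denominator
    while num % p == 0:          # pull factors of p out of the numerator
        num //= p
        k += 1
    while den % p == 0:          # and out of the denominator
        den //= p
        k -= 1
    return k


def padic_abs(x, p):
    """|x|_p as an exact Fraction: 0 for x = 0, otherwise p**(-k)."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-padic_valuation(x, p))


p = 5
samples = [Fraction(n, m) for n in range(-6, 7) for m in (1, 2, 5, 10)]

multiplicative = all(
    padic_abs(x * y, p) == padic_abs(x, p) * padic_abs(y, p)
    for x in samples for y in samples
)
ultrametric = all(
    padic_abs(x + y, p) <= max(padic_abs(x, p), padic_abs(y, p))
    for x in samples for y in samples
)
```

For instance, 50/3 = 5^2 · 2/3, so `padic_abs(Fraction(50, 3), 5)` is 1/25, while `padic_abs(Fraction(3, 50), 5)` is 25.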

Just as the usual absolute value function leads to the distance function
|x − y|, the p-adic absolute value function leads to the p-adic distance function
|x − y|_p on Q. With respect to this distance function, the rationals are not
complete as a metric space, and one can complete the rationals to get a larger
space Q_p. This is analogous to obtaining the real numbers by completing the
rationals with respect to the standard absolute value function. By standard
reasoning the arithmetic operations and p-adic absolute value function extend
from Q to Q_p, with much the same properties as before. In this manner one
gets the field of p-adic numbers. As a metric space, Q_p is complete by
construction, and one can also show that closed and bounded subsets of Q_p
are compact. This is also similar to the real numbers.



Note that the set Z of integers forms a bounded subset of Q_p, in
contrast to being an unbounded subset of R. In fact, each integer has p-adic
absolute value less than or equal to 1. There are general results about
absolute value functions on fields to the effect that if the absolute values of
integers are bounded, then they are less than or equal to 1, and the absolute
value function satisfies the ultrametric version of the triangle inequality. See
p28-9 of [Gou]. In this case the absolute value function is said to be
non-Archimedian. If the absolute values of integers are not bounded, as in the
case of the usual absolute value function, then the absolute value function is
said to be Archimedian.

A related point is that the set Z of integers is a discrete subset of the
real numbers. It has no limit points, and in fact the distance between two
distinct integers is always at least 1. This is not the case in Q_p, where Z is
bounded, and hence precompact. Now consider Z[1/p], the set of rational
numbers of the form p^k n, where k and n are integers. As a subset of R, this
is unbounded, and it also contains nontrivial sequences which converge to 0.
Similarly, as a subset of Q_p, it is unbounded and contains nontrivial sequences
which converge to 0. As a subset of Q_l when l ≠ p, Z[1/p] is bounded and
hence precompact again. Using the diagonal mapping x ↦ (x, x), one can
view Z[1/p] as a subset of the Cartesian product R × Q_p. In this product,
Z[1/p] is discrete again. Indeed, if a = p^k b is a nonzero element of Z[1/p],
where k, b are integers and b is not divisible by p, then either |a|_p > 1, or
|a|_p ≤ 1, in which case k ≥ 0, and |a| ≥ 1.
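The argument for discreteness says that every nonzero a in Z[1/p] satisfies max(|a|, |a|_p) ≥ 1. This can be tested numerically on a finite sample, as in the sketch below. The helper `sizes` and the sampling ranges are assumptions made for illustration, and the check is only a sanity check of the argument, not a proof.

```python
from fractions import Fraction


def sizes(k, b, p):
    """Return (|a|, |a|_p) for a = p**k * b, b a nonzero integer prime to p."""
    if b == 0 or b % p == 0:
        raise ValueError("b must be a nonzero integer not divisible by p")
    real_abs = Fraction(p) ** k * abs(b)    # usual absolute value of a
    padic_abs = Fraction(p) ** (-k)         # |a|_p = p**(-k)
    return real_abs, padic_abs


p = 3
# Sample nonzero elements p**k * b of Z[1/p]; b % p != 0 also excludes b = 0.
elements = [(k, b) for k in range(-6, 7) for b in range(-20, 21) if b % p != 0]

# In R x Q_p, measure a by the larger of its two absolute values.
smallest = min(max(sizes(k, b, p)) for k, b in elements)
```

On this sample `smallest` equals 1, attained at a = ±1. By contrast, `sizes(5, 1, 3)` is `(243, 1/243)` and `sizes(-5, 1, 3)` is `(1/243, 243)`, which shows how p^k tends to 0 in one factor exactly when it is unbounded in the other.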

Similarly, SL_n(Z), the group of n × n invertible matrices with entries in
Z and determinant 1, is a discrete subgroup of SL_n(R), the
analogously-defined group of matrices with real entries. One can define
SL_n(Z[1/p]) and SL_n(Q_p) in the same manner, and using the diagonal embedding
x ↦ (x, x) again, SL_n(Z[1/p]) becomes a discrete subgroup of the Cartesian
product SL_n(R) × SL_n(Q_p).

There are fancier versions of these things for making Q discrete, using
"adeles", which involve p-adic numbers for all primes p. See [Wei].

Now let us turn to some aspects of analysis. With respect to addition,
Q_p is a locally compact abelian group, and thus has a translation-invariant
Haar measure, which is finite on compact sets, strictly positive on nonempty
open sets, and unique up to multiplication by a positive real number. As in
[Tai], there is a rich Fourier analysis for real or complex-valued functions on
Q_p, or Q_p^n when n is a positive integer.

Instead one can also be interested in Q_p-valued functions on Q_p, or on
a subset of Q_p. It is especially interesting to consider functions defined by
power series. As is commonly mentioned, a basic difference between Q_p
and the real numbers is that an infinite series ∑ a_n in Q_p converges if and
only if the sequence of terms a_n tends to 0 as n tends to infinity. Indeed, the
series converges if and only if the sequence of partial sums forms a Cauchy
sequence, and this implies that the terms tend to 0, just as in the case of
real or complex numbers. For p-adic numbers, however, one can use the
ultrametric version of the triangle inequality to check that the partial sums
form a Cauchy sequence when the terms tend to 0. In particular, a power
series ∑ a_n x^n converges for some particular x if and only if the sequence of
terms a_n x^n tends to 0, which is to say that |a_n|_p |x|_p^n tends to 0 as a
sequence of real numbers.
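The step that uses the ultrametric inequality is that a difference of partial sums S_m − S_n = a_{n+1} + ⋯ + a_m has p-adic absolute value at most the largest of |a_{n+1}|_p, ..., |a_m|_p. As a concrete illustration, chosen here and not taken from the text, the series 1 + p + p^2 + ⋯ has terms tending to 0 in Q_p, and its partial sums approach 1/(1 − p), even though the series diverges in R. The sketch below measures the p-adic error of the partial sums exactly; `padic_abs` is a helper defined for this sketch.

```python
from fractions import Fraction


def padic_abs(x, p):
    """|x|_p as an exact Fraction, for a rational number x."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    k, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num, k = num // p, k + 1
    while den % p == 0:
        den, k = den // p, k - 1
    return Fraction(p) ** (-k)


p = 5
limit = Fraction(1, 1 - p)        # candidate p-adic sum of 1 + p + p^2 + ...
partial = Fraction(0)
errors = []
for n in range(8):
    partial += Fraction(p) ** n   # add the term a_n = p**n
    errors.append(padic_abs(partial - limit, p))
```

Since S_N − 1/(1 − p) = −p^{N+1}/(1 − p) and |1 − p|_p = 1, the N-th entry of `errors` is exactly p^{-(N+1)}, which tends to 0.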

Suppose that ∑_{n=0}^∞ a_n x^n is a power series that converges for all x in
Q_p, which is equivalent to saying that |a_n|_p r^n converges to 0 as a sequence
of real numbers for all r > 0. Thus we get a function f(x) defined on all of Q_p,
and we would like to make an analogy with entire holomorphic functions of a 
single complex variable. This is somewhat like the situation of starting with 
a power series that converges on all of R, and deciding to interpret it as a 
function on the complex numbers instead. 

In fact, let us consider the simpler case of a power series with only finitely 
many nonzero terms, which is to say a polynomial. As in the case of complex 
numbers, it would be nice to be able to factor polynomials. The p-adic
numbers Q_p are not algebraically closed, and so in order to factor polynomials
one can first pass to an algebraic closure. It turns out that the p-adic
absolute value can be extended to the algebraic closure, while keeping the
basic properties of the absolute value. See [Cas, Gou]. The algebraic closure
is not complete in the sense of metric spaces with respect to the extended
absolute value function, and one can take a metric completion to get a larger
field to which the absolute values can be extended again. A basic result is
that this metric completion is algebraically closed, so that one can stop here.
Let us write C_p for this new field, which is algebraically closed and
metrically complete.

Once one goes to the algebraic closure, one can factor polynomials. On C_p
one has this property and also one can work with power series. In particular,
since the power series ∑_{n=0}^∞ a_n x^n converges on all of Q_p, it also
converges on all of C_p, so that f(x) can be extended in a natural way to C_p.
For that matter, one can start with a power series that converges on all of
C_p, where the coefficients are allowed to be in C_p, and not just Q_p.

Under these conditions, the function f can be written as a product of an
element of C_p, factors which are equal to x, and factors of the form
(1 − λ_j x), where the λ_j's are nonzero elements of C_p. In other words, the
factors of x correspond to a zero of some order at the origin, while the factors
of the form (1 − λ_j x) correspond to zeros at the reciprocals of the λ_j's. If
f(x) is a polynomial, then there are only finitely many factors, and this
statement is the same as saying that C_p is algebraically closed. In general,
each zero of f(x) is of finite order, and there are only finitely many zeros
within any ball of finite radius in C_p. Thus the set of zeros is at most
countable, and this condition permits one to show that the product of the
factors mentioned above converges when there are infinitely many factors.

This representation theorem can be found on p113 of [Cas] and on p209 of
[Gou]. It is analogous to classical results about entire holomorphic functions
of a complex variable, with some simplifications. In the complex case, it is
necessary to make assumptions about the growth of an entire function for
many results, and the basic factors often need to be more complicated in
order to have convergence of the product. See [Ahl, Vee] concerning entire
holomorphic functions of a complex variable.